This tutorial comes from Carson Sievert’s Plotly for R Master Class.
plotly

The plotly package depends on ggplot2, which
bundles a data set on monthly housing sales in Texan cities acquired
from the TAMU real estate
center. After loading the package, the data is “lazily loaded” into
your session, so you may reference it by name:
## # A tibble: 8,602 × 9
## city year month sales volume median listings inventory date
## <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Abilene 2000 1 72 5380000 71400 701 6.3 2000
## 2 Abilene 2000 2 98 6505000 58700 746 6.6 2000.
## 3 Abilene 2000 3 130 9285000 58100 784 6.8 2000.
## 4 Abilene 2000 4 98 9730000 68600 785 6.9 2000.
## 5 Abilene 2000 5 141 10590000 67300 794 6.8 2000.
## 6 Abilene 2000 6 156 13910000 66900 780 6.6 2000.
## 7 Abilene 2000 7 152 12635000 73500 742 6.2 2000.
## 8 Abilene 2000 8 131 10710000 75000 765 6.4 2001.
## 9 Abilene 2000 9 104 7615000 64500 771 6.5 2001.
## 10 Abilene 2000 10 101 7040000 59300 764 6.6 2001.
## # ℹ 8,592 more rows
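
The call that produced this printout is not shown. A minimal sketch that should reproduce it, assuming plotly (and therefore ggplot2) is installed:

```r
# Attaching plotly also attaches ggplot2, which ships the txhousing data
library(plotly)

# Lazily loaded data can be referenced directly by name
txhousing
```

The exact print format depends on the installed tibble version, so the output may differ slightly in appearance.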
ggplot2

Let’s see if there’s any pattern in house price behavior over time:
p <- txhousing %>%
  group_by(city) %>%
  ggplot(aes(x = date, y = median)) +
  geom_line(aes(group = city), alpha = 0.2)
p

It’d be nice if we could see which city each line corresponds to when
we hover. plotly makes this easy! Just wrap your
ggplot object in the ggplotly() function:
## [1] "gg" "ggplot"
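
The class printed above suggests that `p` is an ordinary ggplot object, which is what `ggplotly()` expects. The conversion call itself is not shown here, so the following is a sketch of what it might look like; the name `gg` is introduced only for illustration.

```r
library(plotly)
library(dplyr)

# Rebuild the ggplot object from the previous step
p <- txhousing %>%
  group_by(city) %>%
  ggplot(aes(x = date, y = median)) +
  geom_line(aes(group = city), alpha = 0.2)

# Convert the static ggplot into an interactive plotly htmlwidget
gg <- ggplotly(p)
gg
```

Printing `gg` in an interactive session should open the plot in the viewer or browser, where hovering over a line shows the mapped values.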
If we just want the city name, we can specify exactly what to put in the tooltip:
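
A sketch of restricting the tooltip, assuming the `tooltip` argument of `ggplotly()` is given the name of the variable mapped to the `group` aesthetic (the name `gg_city` is illustrative only):

```r
library(plotly)
library(dplyr)

p <- txhousing %>%
  group_by(city) %>%
  ggplot(aes(x = date, y = median)) +
  geom_line(aes(group = city), alpha = 0.2)

# Show only the city in the hover text instead of every mapped aesthetic
gg_city <- ggplotly(p, tooltip = "city")
gg_city
```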
plot_ly()

We can also build plotly objects directly using the
plot_ly() function along with dplyr-like
syntax. Why would we want to? Well, for one thing,
plot_ly() recognizes and preserves groupings created with
dplyr’s group_by() function:
library(dplyr)
tx_grouped <- group_by(txhousing, city)
# initiate a plotly object with date on x and median on y
p <- plot_ly(tx_grouped, x = ~date, y = ~median)
plotly_data(p)
## # A tibble: 8,602 × 9
## city year month sales volume median listings inventory date
## <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Abilene 2000 1 72 5380000 71400 701 6.3 2000
## 2 Abilene 2000 2 98 6505000 58700 746 6.6 2000.
## 3 Abilene 2000 3 130 9285000 58100 784 6.8 2000.
## 4 Abilene 2000 4 98 9730000 68600 785 6.9 2000.
## 5 Abilene 2000 5 141 10590000 67300 794 6.8 2000.
## 6 Abilene 2000 6 156 13910000 66900 780 6.6 2000.
## 7 Abilene 2000 7 152 12635000 73500 742 6.2 2000.
## 8 Abilene 2000 8 131 10710000 75000 765 6.4 2001.
## 9 Abilene 2000 9 104 7615000 64500 771 6.5 2001.
## 10 Abilene 2000 10 101 7040000 59300 764 6.6 2001.
## # ℹ 8,592 more rows
Since we didn’t specify a trace type, plot_ly() defaults to a scatterplot:
Let’s change that to a line chart. Similar to
geom_line() in ggplot2, the
add_lines() function connects (a group of) x/y pairs with
lines in the order of their x values and returns the transformed
plotly object:
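
A minimal sketch of that step, rebuilding the grouped plotly object first so the example stands alone. The `alpha` value and the name `p_lines` are illustrative choices, not taken from the tutorial.

```r
library(plotly)
library(dplyr)

tx_grouped <- group_by(txhousing, city)
p <- plot_ly(tx_grouped, x = ~date, y = ~median)

# One line per city, because the grouping on `city` is preserved
p_lines <- add_lines(p, alpha = 0.2)
p_lines
```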
Want to highlight a particular line? Filtering works, and since each
add_lines() call returns the modified
plotly object, we can chain calls together with pipes:
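
As a sketch of such a chain, assuming Houston is the city to highlight (the city, the trace names, and the variable `fig` are illustrative choices):

```r
library(plotly)
library(dplyr)

tx_grouped <- group_by(txhousing, city)

fig <- plot_ly(tx_grouped, x = ~date, y = ~median) %>%
  # Faint background lines for every city
  add_lines(name = "Texan cities", alpha = 0.2) %>%
  # dplyr verbs act on the data attached to the plotly object
  filter(city == "Houston") %>%
  # A second, fully opaque trace for the filtered city only
  add_lines(name = "Houston")
fig
```

Because `filter()` only affects the layers added after it, the first `add_lines()` call still draws all cities.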
ggplot will do…

And just so you don’t think we’re limited to line charts:
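
The original example for this step is not shown, so here is one possible illustration of a non-line trace: a histogram of median sale prices built with `add_histogram()`. The choice of chart and the name `fig_hist` are assumptions.

```r
library(plotly)

# Distribution of median sale prices across all cities and months
fig_hist <- plot_ly(txhousing, x = ~median) %>%
  add_histogram()
fig_hist
```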
Check out The Plotly Cookbook for more details on specific plotly visualization types (“traces”).
Find a new data set to practice with and create at least 2 different interactive plots.